Abstract

We study the volatility time series of the 1137 most traded stocks in the US stock markets for the
two-year period 2001-02 and analyze their return intervals $\tau$, which are the time intervals between
volatilities above a given threshold $q$. We explore the probability density function of $\tau$, $P_q(\tau)$,
assuming a stretched exponential function, $P_q(\tau) \sim e^{-\tau^{\gamma}}$. We find that the exponent $\gamma$ depends on
the threshold in the range between $q = 1$ and 6 standard deviations of the volatility. This finding
supports the multiscaling nature of the return interval distribution. To better understand the
multiscaling origin, we study how $\gamma$ depends on four essential factors: capitalization, risk, number
of trades and return. We show that $\gamma$ depends on the capitalization, risk and return but almost
does not depend on the number of trades. This suggests that $\gamma$ relates to the portfolio selection but
not to the market activity. To further characterize the multiscaling of individual stocks, we fit the
moments of $\tau$, $\mu_m = \langle(\tau/\langle\tau\rangle)^m\rangle^{1/m}$, in the range of $10 < \langle\tau\rangle < 100$ by a power law,
$\mu_m \sim \langle\tau\rangle^{\delta}$. The exponent $\delta$ is found also to depend on the capitalization, risk and return but not
on the number of trades, and its tendency is opposite to that of $\gamma$. Moreover, we show that $\delta$
decreases with $\gamma$ approximately by a linear relation. The return intervals demonstrate the temporal
structure of volatilities, and our findings suggest that their multiscaling features may be helpful
for portfolio optimization.

The study of volatility has long been one of the main topics of economics and econophysics
research. It is important for revealing the mechanism of price dynamics as well as for developing
investment strategies. For example, it helps the investor to estimate the risk and optimize the
portfolio [6, 7]. As a stylized fact of econophysics, the volatility time series has long-term
power-law correlations [8, 9, 10, 11, 12]. The temporal structure in volatilities is complex and
still regarded as an open problem. The return interval $\tau$, also called recurrence time or
interspike interval, is the time interval between two consecutive volatilities above a certain
threshold $q$, and it provides a new approach to analyzing long-term correlated time series. Recent
studies on financial markets [21] show that, for both daily and intraday data, (i) the distribution
of the scaled interval $\tau/\langle\tau\rangle$ can be approximated by a single scaling function, where $\langle\tau\rangle$ is
the average of $\tau$, and this scaling function can be approximated by a stretched exponential (SE)
function; (ii) the sequences of the return intervals have long-term memory, which is related to the
long-term correlations in the original volatility sequences. Similar findings are observed for
other long-term correlated time series, such as climate and earthquake records [13, 15]. There are
also some related studies on financial markets, such as first passage time [25] and level
crossing [26].



As a typical complex system, a financial market is composed of many interconnected participants,
and its time series is usually not of uniscaling nature [27]. Market activity such as the
intertrade time shows multiscaling in its distribution [28, 29]. Recently we suggested that the
return interval distribution has multiscaling characteristics, based on cumulative distributions
and moments of scaled intervals for the 500 constituents of the Standard & Poor's 500 index [24].
This raises the following questions: can we detect multiscaling for a broader market? More
importantly, what is the reason for multiscaling in the return intervals? Is it related to the
market activity? Or is it connected to portfolio selection criteria such as company size, stock
risk or return? The study of these possible relations may shed light on the underlying mechanism
of the volatility and may help investors to optimize their portfolios.

In this paper we analyze the volatility return intervals of the entire US stock markets.
The database analyzed is the Trades And Quotes (TAQ) database from the New York Stock Exchange
(NYSE). The period studied is from Jan 1, 2001 to Dec 31, 2002, a total of 500 trading days.
TAQ records every trade ("tick") for all securities in the US stock markets. The stock
activity varies over a wide range, between 5 and 65,000 trades per day. To construct
minute-resolution data, one needs enough records in every day, and thus we choose only stocks
that have at least 500 daily trades. With this criterion, we obtain 1137 stocks, which are the
most traded in the market. From the tick prices we set the one closest to a minute mark as the
price at that minute. The volatility is defined in the same way as in Ref. [18]. First, we compute
the absolute value of the logarithmic change of the minute price, then remove the intraday
U-shape pattern, and finally normalize the series by its standard deviation. The volatility is
therefore in units of standard deviations. Since the sampling time is 1 minute, a trading
day has 390 points (after removing the market closing hours), and each stock has about
195,000 records.
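
As an illustration only, the three preprocessing steps above can be sketched in Python. The detrending rule used here (dividing each value by the average absolute change at the same minute of the day), the function name `normalized_volatility` and the synthetic prices are assumptions of this sketch; the procedure actually used is the one defined in Ref. [18].

```python
import numpy as np

def normalized_volatility(minute_prices):
    """Volatility in units of standard deviations.

    minute_prices: array of shape (n_days, n_prices_per_day) with positive prices.
    """
    prices = np.asarray(minute_prices, dtype=float)
    # Step 1: absolute logarithmic change between consecutive minutes within each day.
    abs_change = np.abs(np.diff(np.log(prices), axis=1))
    # Step 2 (assumed rule): remove the intraday pattern by dividing by the
    # average absolute change observed at the same minute over all days.
    intraday_pattern = abs_change.mean(axis=0)
    detrended = abs_change / intraday_pattern
    # Step 3: normalize the whole series by its standard deviation.
    series = detrended.ravel()
    return series / series.std()

# Hypothetical data: 50 days, 391 minute prices per day (390 changes), U-shaped activity.
rng = np.random.default_rng(0)
n_days, n_changes = 50, 390
u_shape = 1.0 + 2.0 * np.linspace(-1.0, 1.0, n_changes) ** 2
steps = rng.normal(0.0, 1e-3, size=(n_days, n_changes)) * u_shape
log_prices = np.log(100.0) + np.concatenate(
    [np.zeros((n_days, 1)), np.cumsum(steps, axis=1)], axis=1
)
vol = normalized_volatility(np.exp(log_prices))
```

With 391 prices per day the sketch yields 390 volatility points per day, matching the count quoted above; overnight changes are excluded because the differences are taken within each day.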

The analysis with respect to several essential factors is widely used in economics studies.
For instance, company size, market return and book-to-market value are used to model asset
pricing [30]. Volatilities, and therefore return intervals, may be affected by many factors. Here
we study how the return interval distribution depends on a few essential measures which
characterize different features of the stocks. The first one is the size of the company, which is
a popular criterion for portfolio selection. Stocks of different scales are preferred by investors
of different types. The size also limits the group of investors and the market depth for a stock.
On the other hand, the internal organization of a company may vary dramatically with
its size. Thus, the volatility and its return interval may be strongly influenced by this factor.
The size is usually characterized by the market capitalization, the product of the stock price
and the outstanding shares. Without loss of generality, we choose the price and outstanding
shares on Dec 31, 2002 to calculate the capitalization. For the 1137 stocks, the range of 
capitalization is between 2 x 10^ and 2 x 10^^ dollars. 

The reward and risk are basic concerns for any investment, and we therefore choose them
as the next two factors. The reward is usually measured as the average return of the price, while
the risk is measured as the standard deviation of the return [31]. This traditional definition of
the risk is based on a Gaussian distribution of the time series, which is not always appropriate
for financial data. Nevertheless, it characterizes the magnitude of fluctuations and
therefore the risk. To avoid the intraday pattern, we calculate the return on a daily
basis. The return is the logarithmic daily price change averaged over the two-year period
(2001-2002), which varies from -0.008 to 0.004 for the 1137 stocks. The risk, the standard
deviation of daily returns over the two years, ranges from 0.012 to 0.12. The fourth factor
we study is an activity measure, the number of trades per day. Note that the four factors
reflect different aspects of a stock. The size reflects the scale of the company. The return and
risk are the tendency and variation of historical price movements, which are helpful for
predicting future price changes. The number of trades shows the activity, i.e., how frequently a
stock is traded.
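
A minimal sketch of how the four factors could be computed for one stock is given below. The function name `stock_factors`, its argument names and the three-day price series are hypothetical; using the population standard deviation for the risk is an assumption of this sketch.

```python
import numpy as np

def stock_factors(daily_close, shares_outstanding, daily_trades):
    """Return capitalization, average daily return, risk and activity for one stock."""
    close = np.asarray(daily_close, dtype=float)
    # Logarithmic daily price change.
    log_return = np.diff(np.log(close))
    return {
        # Price on the last day times outstanding shares.
        "capitalization": close[-1] * shares_outstanding,
        # Reward: logarithmic daily price change averaged over the period.
        "return": float(log_return.mean()),
        # Risk: standard deviation of the daily returns.
        "risk": float(log_return.std()),
        # Activity: average number of trades per day.
        "trades": float(np.mean(daily_trades)),
    }

# Hypothetical stock that gains 10% on each of two days.
factors = stock_factors([100.0, 110.0, 121.0], 1_000_000, [600, 800, 1000])
```

In this toy example the two daily returns are identical, so the risk is zero and the average return equals ln(1.1).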

For a volatility time series, we choose a positive value as the threshold $q$ and find those
volatilities above $q$, which are called "events". Note that $q$ is in units of standard deviations.
Then we calculate the time intervals $\tau$ between two consecutive events and compose a new
time series. For each threshold $q$ we have a corresponding time series of return intervals.
For financial markets, the PDF of $\tau$, $P_q(\tau)$, is well approximated by the scaling function

$$P_q(\tau) = \frac{1}{\langle\tau\rangle} f(\tau/\langle\tau\rangle), \tag{1}$$

where $\langle\cdot\rangle$ stands for the average over the data set. The scaling function $f(x)$, where $x$
corresponds to the scaled interval $\tau/\langle\tau\rangle$, can be approximated by a SE function,

$$f(x) = c\, e^{-(ax)^{\gamma}}, \tag{2}$$

consistent with other long-term correlated records [13, 16]. The normalization constant
$c$ and the scale parameter $a$ depend on the exponent $\gamma$, and thus $f(x)$ has only one free
parameter [16, 24]. When the record has no long-term correlations, the return intervals
follow, as expected, an exponential distribution, i.e., $\gamma = 1$. As an example, we plot in Fig. 1
the PDFs of return intervals for a typical stock, General Electric (GE). The PDFs for four
values of $q$ ($q = 2$ to 5) almost collapse onto a single curve. We also plot a SE (Eq. (2))
fit of the curve for $q = 2$. For small values of $\tau$, there are some deviations from the SE
function. Eichner et al. suggested that the scaling function is characterized by a power-law
function for short time scales and a SE function for long time scales [22]. To avoid these
deviations, we analyze the scaling function only for large scales ($\tau/\langle\tau\rangle > 0.1$).
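
The construction of the return intervals, and the way the two constants of Eq. (2) follow from the exponent, can be sketched as below. The closed forms in `se_constants` are not quoted from the text: they are obtained here by requiring that $f(x)$ integrates to 1 and that the scaled interval has unit mean, which holds by construction. The function names and the ten-point volatility series are hypothetical.

```python
import math
import numpy as np

def return_intervals(volatility, q):
    """Intervals (in sampling steps) between consecutive volatilities above q."""
    event_times = np.flatnonzero(np.asarray(volatility) > q)
    return np.diff(event_times)

def se_constants(gamma):
    """Constants c and a of f(x) = c * exp(-(a*x)**gamma).

    Derived from the two conditions  integral f(x) dx = 1  and
    integral x f(x) dx = 1  over x from 0 to infinity.
    """
    a = math.gamma(2.0 / gamma) / math.gamma(1.0 / gamma)
    c = gamma * a / math.gamma(1.0 / gamma)
    return c, a

# Hypothetical series; the events for q = 2 are at positions 1, 4, 5 and 9.
vol = np.array([0.5, 2.3, 0.1, 0.4, 3.0, 2.5, 0.2, 0.3, 0.9, 2.1])
tau = return_intervals(vol, q=2.0)
scaled = tau / tau.mean()          # scaled intervals tau / <tau>

c_exp, a_exp = se_constants(1.0)   # uncorrelated case, gamma = 1
c_half, a_half = se_constants(0.5)
```

For $\gamma = 1$ both constants equal 1 and Eq. (2) reduces to the plain exponential $e^{-x}$, in line with the uncorrelated case mentioned above.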

In a recent paper [24], indications of deviations from the scaling function Eq. (2) were
observed for the return intervals. The cumulative distributions for different thresholds $q$
were found to deviate systematically from a single scaling function [24]. This indicates that
the exponent $\gamma$ may change with the threshold $q$. To test this assumption quantitatively
and over the entire market, we compute $\gamma$ for all 1137 stocks and plot in Fig. 2 their averages
and standard deviations (as error bars) as a function of $q$. The values of $\gamma$ are obtained from
the least-squares fit of the scaling function, Eq. (2), to the data for the range $\tau/\langle\tau\rangle > 0.1$
(see Fig. 1). The range of $q$ studied is from 1 to 6 with steps of 0.25. We consider a
point as an outlier if its RMS error is larger than 10%. In total, 730 out of 22740 points,
or 3.8% of all points, are removed. Fig. 2 shows that the mean $\gamma$ decreases with $q$, from
0.49 for $q = 1$ to 0.28 for $q = 3$. For large thresholds (between $q = 3$ and 6) $\gamma$ tends to be
constant (around 0.26), where the distribution can be regarded as close to uniscaling.
The difference in $\gamma$ between small and large thresholds suggests multiscaling in
the distribution over the whole range. The volatility time series has long-term correlations,
which can be characterized by the exponent $\alpha$ obtained from the Detrended Fluctuation Analysis
(DFA) method [8, 18, 32]. Assuming the validity of the relation between $\gamma$ in Eq. (2)
and the long-term correlations in the volatilities [24], $\alpha = 1 - \gamma/2$, it follows that small
volatilities have large $\gamma$ and weak correlations, while large volatilities have small $\gamma$ and
strong correlations. Large volatilities correspond to long time scales and small volatilities
correspond to short time scales. The changes in the value of $\gamma$ seen in Fig. 2 might be due to
the changes in $\alpha$ found between short and long time scales in the volatility records [8, 18].
The error bars shown in Fig. 2 are small for all thresholds, which indicates that the tendency
is consistent for the entire market. Note that the error bars for the few largest thresholds, such
as $q = 6$ and 5.75, are slightly larger, probably due to the poor statistics of fewer events.
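
A one-parameter fit of Eq. (2) restricted to $\tau/\langle\tau\rangle > 0.1$ can be sketched as follows. The grid search over $\gamma$, the use of squared errors of the logarithm of the PDF, and the noise-free synthetic input are assumptions of this sketch and are not claimed to be the fitting procedure used for Fig. 2. The constants $c$ and $a$ are fixed by $\gamma$ through normalization and unit mean.

```python
import math
import numpy as np

def log_se_pdf(x, gamma):
    """log f(x) for f(x) = c * exp(-(a*x)**gamma), with c and a fixed by gamma."""
    a = math.gamma(2.0 / gamma) / math.gamma(1.0 / gamma)
    c = gamma * a / math.gamma(1.0 / gamma)
    return math.log(c) - (a * np.asarray(x, dtype=float)) ** gamma

def fit_gamma(x, pdf, x_min=0.1):
    """Grid-search least-squares estimate of gamma using only points with x > x_min."""
    x = np.asarray(x, dtype=float)
    pdf = np.asarray(pdf, dtype=float)
    keep = (x > x_min) & (pdf > 0)
    grid = np.arange(0.10, 1.50, 0.01)   # candidate exponents (assumed range)
    sq_err = [np.sum((np.log(pdf[keep]) - log_se_pdf(x[keep], g)) ** 2) for g in grid]
    return float(grid[int(np.argmin(sq_err))])

# Synthetic, noise-free PDF generated with gamma = 0.30.
x = np.logspace(-1.5, 1.5, 40)
pdf = np.exp(log_se_pdf(x, 0.30))
gamma_fit = fit_gamma(x, pdf)
alpha = 1.0 - gamma_fit / 2.0            # relation alpha = 1 - gamma/2
```

Applying the same relation to the mean values quoted above gives $\alpha = 1 - 0.49/2 = 0.755$ for $q = 1$ and $\alpha = 1 - 0.26/2 = 0.87$ for the large-threshold plateau.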

Next we study the relations between $\gamma$ and the four essential factors: market capitalization,
risk, number of trades and return. This tests the universality of $\gamma$ over the entire
market. If $\gamma$ is sensitive to some factors, the market as one system is not uniscaling.
Furthermore, the dependence (if it exists) may indicate some origins of the multiscaling found in
return interval distributions. In Fig. 3, we plot $\gamma$ against the four factors for four thresholds,
$q = 2$, 3, 4 and 5. In each panel, the curves have a similar tendency and the value of $\gamma$ decreases
with $q$. Note that the curves are closer to each other for large thresholds. This finding is
consistent with Fig. 2, which shows that the mean $\gamma$ decreases with $q$ and reaches an almost
constant value for large $q$. More importantly, Fig. 3 shows that $\gamma$ for a given threshold
is not uniformly distributed over the factor values, and thus the market is of multiscaling
nature.
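
One way to turn per-stock exponents into a curve against a factor is to average them within bins of the factor. The sketch below assumes equal-width bins in the logarithm of the factor; this binning rule, the function name `binned_mean` and the six data points are hypothetical and are not taken from the text.

```python
import numpy as np

def binned_mean(factor, exponent, n_bins=5, log_bins=True):
    """Average an exponent within equal-width bins of a factor (or of its log10)."""
    factor = np.asarray(factor, dtype=float)
    exponent = np.asarray(exponent, dtype=float)
    key = np.log10(factor) if log_bins else factor
    edges = np.linspace(key.min(), key.max(), n_bins + 1)
    # digitize gives 1..n_bins for interior points; clip puts the maximum in the last bin.
    index = np.clip(np.digitize(key, edges) - 1, 0, n_bins - 1)
    centers, means = [], []
    for b in range(n_bins):
        members = index == b
        if members.any():                 # skip empty bins
            centers.append(key[members].mean())
            means.append(exponent[members].mean())
    return np.array(centers), np.array(means)

# Hypothetical capitalizations (dollars) and exponents for six stocks.
cap = np.array([1e8, 2e8, 1e9, 3e9, 1e10, 5e10])
gam = np.array([0.30, 0.32, 0.36, 0.38, 0.41, 0.39])
centers, means = binned_mean(cap, gam, n_bins=3)
```

The same helper can be applied to the return with `log_bins=False`, since returns take negative values.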

For the company size (Fig. 3(a)), $\gamma$ increases for sizes between 5 x 10'' and $2 \times 10^{10}$ dollars
and then shows a slight decrease. The market depth of small companies limits the size
of investors, and those companies usually attract some specific types of investors. The
corresponding strategies may therefore be relatively similar, and the volatility series tends to
be strongly correlated, with a small $\gamma$ [13]. With increasing size, more investors are involved,
which may "randomize" the long-term correlations in volatilities. When the company size
reaches a certain limit, the composition of investor types may be relatively stable, some
common modes might dominate the volatilities, and therefore the correlations become stronger
and $\gamma$ decreases with the size.

Fig. 3(b) shows that $\gamma$ decreases with the risk except for very low risks. Fig. 2 shows
that larger volatilities tend to have smaller $\gamma$. Larger risk means that the probability of
larger volatilities is higher. Therefore, Fig. 3(b) is consistent with Fig. 2. Price movement
is realized by trades, and the temporal structure of the volatility probably relates to the size
of trades. Counterintuitively, $\gamma$ is almost insensitive to the market activity. Fig. 3(c)
suggests no apparent dependence between $\gamma$ and the number of trades [33], see also [28]. A
possible reason is that many investors do not change their strategies only because of a
dramatic change in trading frequency. Next we show in Fig. 3(d) the relation between $\gamma$
and the return. $\gamma$ increases with the return for negative returns and decreases for positive
returns, with a maximum when the return is 0. This behavior suggests that the return is related
to the size of the risk. Returns of large magnitude represent high volatilities; the corresponding
risk is relatively high and therefore $\gamma$ is small (see Fig. 3(b)).

Next we study the multiscaling behavior of individual stocks. The moments of the scaled
interval, $\tau/\langle\tau\rangle$, can quantify the deviations from a single scaling function and therefore
provide a good measure to test the multiscaling in individual stocks. In Fig. 4 we plot $\mu_m$
for GE as an example. The moment and the corresponding exponent $\delta$ for the multiscaling
are defined as

$$\mu_m = \langle(\tau/\langle\tau\rangle)^m\rangle^{1/m} \sim \langle\tau\rangle^{\delta}. \tag{3}$$

If the distribution of the return intervals follows a unique scaling law as in Eq. (1), the
different moments should be independent of $\langle\tau\rangle$ and therefore the exponent $\delta$ should be
0. A significant $\delta$ suggests multiscaling, and the value of $\delta$ characterizes the strength of
the multiscaling [24]; we thus call $\delta$ the multiscaling exponent [34]. We find that $\mu_m$ changes
systematically with $\langle\tau\rangle$. For $m > 2$, moments first increase with $\langle\tau\rangle$ and then decrease [24]
(for a typical example, see Fig. 4). Since a value of $\langle\tau\rangle$ corresponds to a threshold value $q$,
the moments have the same trend with $q$. Here we choose four typical orders for moments,
$m = 2$, 4, 8 and 16. For other positive orders, we find similar behaviors. For these four
orders, the 1137 stocks have in total 215 out of 4548 cases (4.7%) where the RMS error of
the fit is over 22%, and these cases are not included in the analysis.
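
Eq. (3) and the power-law fit that yields the multiscaling exponent can be sketched as below. The straight-line fit in log-log coordinates via `numpy.polyfit`, the function names and the synthetic inputs are assumptions of this sketch; only the fitting window $10 < \langle\tau\rangle < 100$ is taken from the text.

```python
import numpy as np

def scaled_moment(tau, m):
    """mu_m = < (tau/<tau>)^m >^(1/m) for one series of return intervals."""
    x = np.asarray(tau, dtype=float)
    x = x / x.mean()
    return float(np.mean(x ** m) ** (1.0 / m))

def multiscaling_exponent(mean_tau, mu, lo=10.0, hi=100.0):
    """Slope of log(mu_m) versus log(<tau>) inside the window lo < <tau> < hi."""
    mean_tau = np.asarray(mean_tau, dtype=float)
    mu = np.asarray(mu, dtype=float)
    window = (mean_tau > lo) & (mean_tau < hi)
    slope, _intercept = np.polyfit(np.log(mean_tau[window]), np.log(mu[window]), 1)
    return float(slope)

# Hypothetical return intervals for one threshold.
tau = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
mu2 = scaled_moment(tau, 2)

# Synthetic moments following an exact power law with exponent 0.1.
mean_tau = np.array([5.0, 12.0, 20.0, 35.0, 60.0, 90.0, 150.0])
mu_synthetic = 1.3 * mean_tau ** 0.1
delta = multiscaling_exponent(mean_tau, mu_synthetic)
```

In practice each threshold $q$ contributes one pair ($\langle\tau\rangle$, $\mu_m$), so the arrays passed to `multiscaling_exponent` would hold one entry per threshold for a given stock and order $m$.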

Now we focus on the relation between the multiscaling exponent $\delta$ and the four factors.
We plot in Fig. 5 the curves for four orders, $m = 2$, 4, 8 and 16. These curves have a similar
tendency in each panel. The value of $\delta$ increases a little from $m = 2$ to 4, then decreases,
which is consistent with the result in Ref. [24]. As shown in Fig. 5(a), $\delta$ decreases with the
capitalization until about $2 \times 10^{10}$ dollars and then the curves increase. This suggests that
$\delta$ also relates to the composition of investors. A small company has few investors, who
have some specific strategies. However, if the company is very large, some types of investors
eventually dominate the price movement. In Fig. 5(b), $\delta$ increases almost monotonically with
the risk, indicating that if a stock has larger volatility values, its return interval distribution
has a stronger multiscaling effect. Similar to $\gamma$, $\delta$ is almost independent of the number of
trades, as shown in Fig. 5(c). In Fig. 5(d), $\delta$ has a minimum at zero return, which also
agrees with the relation between $\delta$ and risk.

There are clear connections between Fig. 3 and Fig. 5, which indicate that $\gamma$ and $\delta$ are
strongly related. From Fig. 2, $\gamma$ decreases with $q$, a $q$ value corresponds to a $\langle\tau\rangle$ value, and
$\delta$ is the power-law fitting exponent for the moment vs. $\langle\tau\rangle$. To examine the relation between
the two exponents we plot $\delta$ against $\gamma$ in Fig. 6. Our results suggest that $\delta$ decreases with
$\gamma$ for all four thresholds $q = 2$, 3, 4 and 5 when $m = 2$. These curves approximately follow a
linear function, as guided by the dashed lines with slopes $-0.63$, $-0.75$, $-0.74$ and $-0.62$,
respectively. Other thresholds $q$ and orders $m$ show similar results. The smaller the value
of $\gamma$, the larger the deviation from a single scaling function observed for the return interval
distribution.

The SE exponent $\gamma$ characterizes the return intervals, which depend on the temporal
structure of the volatility time series. In other words, $\gamma$ characterizes the dynamic property
of volatility. Capitalization, risk and return are fundamental measures of a company, while the
number of trades reflects the market activity, which is due to the market participants and not
influenced by the company managers. We show that $\gamma$ relates to these fundamental measures
but not to the activity. We also test the relation between $\gamma$ and share volume, and find no
clear dependence, similar to that for the number of trades. Although there is a certain relation
between those measures, for instance, the number of trades depends on the capitalization
[29], this does not guarantee that $\gamma$ depends on the number of trades. For a company with a
given number of trades, its capitalization has a range of values, and for a company with a
given capitalization, its $\gamma$ is also distributed over a certain interval. Since there is a crossover
in the curve of $\gamma$ versus capitalization, it is possible that $\gamma$ is not sensitive to the number of
trades. Capitalization, risk and return are widely used for building portfolios. Therefore, $\gamma$
connects the dynamic structure of the price movement with fundamental measures, which
may provide a helpful indicator for portfolio selection. Similarly, the multiscaling exponent
$\delta$ could also be used to optimize the portfolio. Recently Bogachev et al. studied return
intervals in multifractal data sets and suggested that the return interval follows a power-law
distribution [23]. Meanwhile, Livina et al. suggested a Gamma distribution for earthquake
time series, which also have long-term correlations [14]. Therefore a detailed analysis of the
distribution function is needed.

In summary, we analyzed the volatility return intervals for the 1137 most traded stocks in the
United States markets. We have shown that the SE exponent $\gamma$ depends on the threshold $q$,
which supports the multiscaling nature of the return interval distribution. We also studied the
relation between $\gamma$ and four essential factors of a stock: capitalization, risk, number of trades
and return. We found that $\gamma$ depends on the capitalization, risk and return but not on the
number of trades, which suggests multiscaling in the entire market. We further analyzed
the multiscaling exponent $\delta$, which characterizes the multiscaling of individual stocks. We
found that it again depends on the capitalization, risk and return but not on the number of
trades. Our results suggest that $\delta$ and $\gamma$ may be useful for portfolio optimization.

We thank S.-J. Shieh, R. Mantegna, J. Kertesz and Z. Eisler for helpful discussions, and 
the NSF and Merck Foundation for financial support. 

FIG. 1: (Color online) Return interval PDFs for four thresholds, $q = 2$, 3, 4 and 5, for the GE stock.
These four curves approximately collapse onto a single one, and the scaling function is approximately
a stretched exponential, as guided by the black curve, which is the SE fit to the data for $q = 2$
(shifted vertically for better visibility).

FIG. 2: (Color online) SE exponent $\gamma$ vs. threshold $q$. The filled circles are the values of $\gamma$ averaged
over the 1137 most traded stocks and the error bars are the corresponding standard deviations. The
dashed line is a guide line at $\gamma = 0.26$.

FIG. 3: (Color online) Relation between SE exponent $\gamma$ and four factors: (a) market capitalization,
(b) risk, the standard deviation of daily return, (c) average daily number of trades and (d) average
daily return. Curves for four thresholds ($q = 2$, 3, 4 and 5) are shown. Dashed lines are
logarithmic fits (except for (d), where the fit is linear) to the curve for $q = 2$.

FIG. 4: (Color online) Typical moments of the GE stock. Four orders, $m = 2$, 4, 8 and 16,
are shown. Dashed lines are power-law fits in the range of $10 < \langle\tau\rangle < 100$ for determining the
multiscaling exponent $\delta$.

FIG. 5: (Color online) Relation between multiscaling exponent $\delta$ and four factors: (a) market
capitalization, (b) risk, the standard deviation of daily return, (c) average daily number of trades
and (d) average daily return. Curves for four moments, $m = 2$, 4, 8 and 16, are shown. Dashed lines
are fits to the curve for $m = 16$, which demonstrate the tendency.

FIG. 6: (Color online) Multiscaling exponent $\delta$ vs. SE exponent $\gamma$. The values of $\delta$ are for order
$m = 2$ and those of $\gamma$ are for thresholds $q = 2$, 3, 4 and 5. Linear fits are shown by dashed lines.